Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

Authors

  • Jeremy G Sumner
  • Amelia Taylor
  • Barbara R Holland
  • Peter D Jarvis
Abstract

Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well-understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, or to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest because, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants. A wider implication of this is that, for models with more than two states (for example, DNA sequence alignments with four-state models), we find that methods which rely on phylogenetic invariants are incapable of satisfying all three of the stated statistical properties. This is because in these cases the relevant Markov invariants belong to a class of polynomials independent from the phylogenetic invariants.
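
As an illustration of the raw input such methods work with, the following minimal sketch tabulates the 16 binary site-pattern frequencies for a quartet of aligned sequences and shows how reordering the input sequences, as in property (1), simply permutes the positions within each pattern. It is not the authors' method: the function names `site_pattern_frequencies` and `reorder_taxa` and the toy alignment are assumptions made for this example, and no invariant is evaluated here.

```python
from collections import Counter
from itertools import product


def site_pattern_frequencies(seqs):
    """Relative frequency of each of the 16 binary site patterns.

    `seqs` is a list of four equal-length strings over the alphabet {"0", "1"}
    (hypothetical input format chosen for this sketch).
    """
    if len(seqs) != 4:
        raise ValueError("expected exactly four sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    # Each alignment column is one site pattern, e.g. ("0", "1", "1", "0").
    counts = Counter(zip(*seqs))
    return {p: counts.get(p, 0) / length for p in product("01", repeat=4)}


def reorder_taxa(freqs, order):
    """Frequencies after reordering the taxa: position k takes old taxon order[k]."""
    return {tuple(p[i] for i in order): f for p, f in freqs.items()}


# Toy alignment (made up for illustration): four taxa, four sites.
alignment = ["0011", "0101", "0110", "0001"]
freqs = site_pattern_frequencies(alignment)

# Swapping the first two taxa permutes the pattern labels accordingly.
swapped = reorder_taxa(freqs, (1, 0, 2, 3))
```

Any polynomial in these frequencies, whether a phylogenetic or a Markov invariant, is then a function of the 16 values in `freqs`. A method with sensible behaviour under reordering should respond predictably when `freqs` is replaced by `swapped`.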

Similar resources

Markov invariants and the isotropy subgroup of a quartet tree.

The purpose of this article is to show how the isotropy subgroup of leaf permutations on binary trees can be used to systematically identify tree-informative invariants relevant to models of phylogenetic evolution. In the quartet case, we give an explicit construction of the full set of representations and describe their properties. We apply these results directly to Markov invariants, thereby ...

Markov invariants, plethysms, and phylogenetics (the long version)

We explore model-based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our...

arXiv:0711.3503v2 [q-bio.PE] 26 Nov 2007 Markov invariants, plethysms, and phylogenetics

We explore model-based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our...

Quartets and Parameter Recovery for the General Markov Model of Sequence Mutation

Methods of inference of the evolutionary history leading to currently extant species, or taxa, have been transformed in recent years by the ready availability of biological sequence data such as that from DNA. While many approaches to this inference problem have been developed, some of the methods most appealing theoretically are so computationally intensive that they cannot be carried out exac...

Topological metrizations of trees, and new quartet methods of tree inference

Topological phylogenetic trees can be assigned edge weights in several natural ways, highlighting different aspects of the tree. Here the rooted triple and quartet metrizations are introduced, and applied to formulate novel fast methods of inferring large trees from rooted triple and quartet data. These methods can be applied in new statistically consistent procedures for inference of a species...

Journal:
  • Journal of Mathematical Biology

Volume 75, Issue 6-7

Pages: -

Publication date: 2017